feat: Add nested hypergraph generators (Kim et al. 2023, Barrett et al. 2025) by jg-you · Pull Request #683 · xgi-org/xgi

jg-you · 2026-02-19T05:25:19Z

Adds two generative models that explicitly parameterize nestedness: random_nested_hypergraph and simplicial_chung_lu_hypergraph.

Barrett et al.: Implementation of Algorithms 3 and 4 from the paper. Couldn't find much to re-use between this and chung_lu_hypergraph since the methods are fundamentally different.

Kim et al.: Implements the algorithm described in Section II here. Duplicate edges are prevented during facet generation (step 1) and after rewiring (step 3) by storing edges as frozensets in a set. I wasn't sure whether epsilon is fixed or can vary by order, so the argument accepts either a float or a list to support both cases.
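To illustrate the two points above, here is a minimal sketch of how a float-or-list epsilon could be normalized and how frozensets make duplicate checks order-independent. The helper names normalize_epsilon and add_unique are hypothetical and not taken from the PR, and the assumption that epsilon applies to sub-edge sizes 2 through d - 1 is mine; the actual implementation may differ.

```python
from collections.abc import Sequence


def normalize_epsilon(epsilon, d):
    """Map each sub-edge size to a retention value (hypothetical helper).

    Assumption: epsilon applies to sizes 2..d-1, one value per size.
    """
    sizes = range(2, d)
    if isinstance(epsilon, (int, float)):
        # A single number is broadcast to every size.
        values = [float(epsilon)] * len(sizes)
    elif isinstance(epsilon, Sequence):
        values = [float(e) for e in epsilon]
        if len(values) != len(sizes):
            raise ValueError(
                f"expected {len(sizes)} epsilon values, got {len(values)}"
            )
    else:
        raise TypeError("epsilon must be a float or a sequence of floats")
    if any(not 0.0 <= e <= 1.0 for e in values):
        raise ValueError("epsilon values must lie in [0, 1]")
    return dict(zip(sizes, values))


def add_unique(edges, members):
    """Add an edge to a set of frozensets; return False if it was a duplicate."""
    edge = frozenset(members)  # node order does not matter
    if edge in edges:
        return False
    edges.add(edge)
    return True
```

With this sketch, normalize_epsilon(0.5, 4) and normalize_epsilon([0.5, 0.5], 4) yield the same mapping, and add_unique treats (1, 2, 3) and (3, 2, 1) as the same edge.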

Notes

Tried to match the codebase conventions for parameter names: d for edge size, p for probability, epsilon for retention, k1/k2 for degree/size sequences.
Seed reproducibility tests only assert same-seed determinism. There's a bug in the existing code, e.g., here, where different seeds could lead to the edges being identical nonetheless. Did not reproduce the pattern in the new tests as a result.
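A same-seed determinism test along these lines might look like the sketch below. The generator toy_random_edges is a stand-in written for this example, not an xgi function, and it uses the standard library's random module rather than whatever RNG convention the library follows.

```python
import random


def toy_random_edges(n, m, d, seed=None):
    """Stand-in generator (not part of xgi): m random edges of size d on n nodes."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(n), d)) for _ in range(m)]


def test_same_seed_is_deterministic():
    # Only same-seed equality is asserted; nothing is claimed about
    # the output of two different seeds.
    first = toy_random_edges(20, 10, 3, seed=42)
    second = toy_random_edges(20, 10, 3, seed=42)
    assert first == second
```

The sketch deliberately omits a "different seeds give different edges" assertion, in line with the note above.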

maximelucas · 2026-03-04T17:27:36Z

Thanks @jg-you !

Quick dispatching for a few points:

@leotrs, a test in stats is failing, can you have a look?
The test bug Jean-Gab mentioned is related to how we check equality between two hypergraphs, if I understand correctly. We have since improved this in "Improved the hypergraph equality method" #671. @nwlandry, you did this; can you check whether we can update the test with our new equality methods?

I can review random_nested_hypergraph. @nwlandry, maybe you want to review simplicial_chung_lu_hypergraph since you know Chung-Lu better? Or anyone else.

leotrs · 2026-03-05T05:17:45Z

The failing test (test_perfectly_separable_low_dimensions) was already fixed on dev — the assertions were loosened to check core cluster membership rather than exact cluster sizes (which vary across platforms due to ARPACK/LAPACK differences in eigsh).

Merging dev into the nestedness branch should resolve the CI failures.

jg-you · 2026-03-10T02:39:59Z

Fixed, should be g2g

maximelucas · 2026-03-12T14:10:23Z

xgi/generators/random.py

+
+    # Step 1: Generate m unique facets of size d
+    facets = set()
+    while len(facets) < m:


This may go into an infinite loop if m is large and d is large compared to N, right?
If so, add some checks earlier on and raise an error to prevent this.

That's fair! Actual condition is m > (n choose d).
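As a rough sketch of such a guard, assuming the loop rejection-samples frozensets as in the snippet above: the function name sample_unique_facets is hypothetical, and the standard library's random module stands in for the library's own RNG handling.

```python
import math
import random


def sample_unique_facets(n, m, d, seed=None):
    """Draw m distinct facets of size d from n nodes (illustrative sketch)."""
    if not 0 < d <= n:
        raise ValueError(f"d must satisfy 0 < d <= n, got d={d}, n={n}")
    max_facets = math.comb(n, d)
    if m > max_facets:
        # Without this check the while loop below could never terminate.
        raise ValueError(
            f"cannot draw m={m} unique facets: only {max_facets} exist "
            f"for n={n}, d={d}"
        )
    rng = random.Random(seed)
    facets = set()
    while len(facets) < m:
        facets.add(frozenset(rng.sample(range(n), d)))
    return facets
```

The check makes termination certain, but the loop still slows down as m approaches (n choose d), since most draws are then rejected as duplicates.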

How much optimization is worth it for the library?
I could sample non-edges instead when m > (n choose d) // 2, which would make this algorithm perform well in both the sparse and dense regimes. Maybe there's another quick win for m ≈ (n choose d) // 2.
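The non-edge idea could be sketched roughly as follows. This is an illustration under my own assumptions, not the PR's code: the function name is hypothetical, the standard library's random module stands in for the library's RNG handling, and the dense branch simply enumerates all candidate facets.

```python
import math
import random
from itertools import combinations


def sample_facets_dense_aware(n, m, d, seed=None):
    """Draw m distinct size-d facets, sampling the complement when m is large."""
    total = math.comb(n, d)
    if m > total:
        raise ValueError(f"m={m} exceeds the {total} possible facets")
    rng = random.Random(seed)

    def rejection_sample(count):
        # Efficient while count is at most about half of all possible facets.
        chosen = set()
        while len(chosen) < count:
            chosen.add(frozenset(rng.sample(range(n), d)))
        return chosen

    if m <= total // 2:
        return rejection_sample(m)
    # Dense regime: pick the (few) facets to leave out, then keep the rest.
    excluded = rejection_sample(total - m)
    return {frozenset(c) for c in combinations(range(n), d)} - excluded
```

Enumerating every combination in the dense branch costs time and memory proportional to (n choose d), but the output already holds more than half of those facets, so the overhead stays within a constant factor of the result size. Either branch rejects at most about half of its draws, which is the worst case at m ≈ (n choose d) // 2.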

maximelucas · 2026-03-12T14:19:46Z

Ok I reviewed random_nested_hypergraph. Left a few comments, mainly to adhere to the new guidelines for random number generators, which we recently adopted #689.

jg-you · 2026-03-16T19:23:41Z

Ty, all implemented except for the one comment I have a question about -- level of optimization desired.
Main tradeoff being code complexity vs speed.

Random models with controllable nestedness

9bddfce

kaiser-dan added the new feature label Feb 26, 2026

kaiser-dan changed the base branch from main to dev February 26, 2026 15:42

jg-you added 2 commits March 9, 2026 22:27

Merge remote-tracking branch 'upstream/dev' into nestedness

bcc5184

Fix imports after merging dev; simplify comments

112daad